A scheduling and arbitration scheme in an input-buffered switch with speedup,
providing deterministic bandwidth and delay performance independent of switch
size, is presented. Within the framework of a crossbar architecture having a
plurality of input channels and output channels, the scheduling and arbitration
scheme determines the sequence of fixed-size packet or cell transmissions between
the input channels and output channels satisfying the constraint that only one
cell can leave an input channel and enter an output channel per phase, in such a
way that the arbitration delay is bounded for each cell awaiting transmission at
the input channel. If the fixed-size packets result from fragmentation of
variable size packets, the scheduling and arbitration scheme implies that if delay
guarantees are provided at the cell level, they are also provided to the initial
variable size packets, re-assembled at the output channel, as well.

METHOD FOR PROVIDING DELAYS INDEPENDENT OF SWITCH SIZE IN A
CROSSBAR SWITCH WITH SPEEDUP

FIELD OF THE INVENTION
The present invention relates generally to variable and fixed size packet switches, and
more particularly, to an apparatus and method for scheduling packet cell inputs through such
packet switches.

BACKGROUND OF THE INVENTION 
In the field of Integrated Services Networks, the importance of maintaining Quality of
Service (QoS) for individual traffic streams (or flows) is generally recognized. Thus, such
capability continues to be the subject of much research and development. Of particular interest
is the delay experienced by an individual packet or cell. Good delay performance must be
provided to all flows abiding by their service contract negotiated at connection setup, even in the
presence of other potentially misbehaved flows. Many different methods have been developed
to provide such performance in non-blocking switch architectures such as output buffered or
shared memory switches. Several algorithms providing a wide range of delay guarantees for
non-blocking architectures have been disclosed in the literature. See, for example, A. Parekh, "A
Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks",
MIT, Ph.D. dissertation, June 1994; J. Bennett and H. Zhang, "WF2Q - Worst-case Fair Weighted
Fair Queuing", Proc. IEEE INFOCOM '96; D. Stiliadis and A. Varma, "Frame-Based Fair
Queuing: A New Traffic Scheduling Algorithm for Packet Switch Networks", Proc. IEEE
INFOCOM '96; L. Zhang, "A New Architecture for Packet Switched Network Protocols,"
Massachusetts Institute of Technology, Ph.D. Dissertation, July 1989; A. Charny, "Hierarchical
Relative Error Scheduler: An Efficient Traffic Shaper for Packet Switching Networks," Proc.
NOSSDAV '97, May 1997, pp. 283-294; and others. Schedulers capable of providing bandwidth
and delay guarantees in non-blocking architectures are commonly referred to as "QoS-capable
schedulers".

Typically, output-buffered or shared memory non-blocking architectures require the
existence of high-speed memory. For example, an output-buffered switch requires that the speed
of memory at each output must be equal to the total speed of all inputs. Unfortunately, the rate
of the increase in memory speed available with current technology has not kept pace with the
rapid growth in demand for providing large-scale integrated services networks. Because there
is a growing demand for large switches with total input capacity on the order of tens and hundreds
of Gb/s, building an output buffered switch at this speed has become impractical given the
present state of technology. Similar issues arise with shared memory switches as well.

As a result, many industrial and research architectures have adopted a more scalable
approach, for example, crossbars. Details of such architectures may be had with reference to the
following papers: T. Anderson, S. Owicki, J. Saxe, C. Thacker, "High Speed Switch Scheduling
for Local Area Networks," Proc. Fifth Internat. Conf. on Architectural Support for Programming
Languages and Operating Systems, Oct. 1992, pp. 98-110; and N. McKeown, M. Izzard, A.
Mekkittikul, W. Ellersick and M. Horowitz, "The Tiny Tera: A Packet Switch Core." Even
given the advances in the art, providing QoS in an input-queued crossbar switch remains a
significant challenge.

A paper by N. McKeown, V. Anantharam and J. Walrand, entitled "Achieving 100%
Throughput in an Input-Queued Switch," Proc. IEEE INFOCOM '96, March 1996, pp. 296-302,
describes several algorithms based on weighted maximum bipartite matching (defined therein)
and capable of providing 100% throughput in an input-buffered switch. Unfortunately, the
complexity of these algorithms is viewed as too high to be realistic for high-speed hardware
implementations. In addition, the nature of the delay guarantees provided by these algorithms
remains largely unknown.

Published research by D. Stiliadis and A. Varma, entitled "Providing Bandwidth
Guarantees in an Input-Buffered Crossbar Switch," Proc. IEEE INFOCOM '95, April 1995, pp.
960-968, suggests that bandwidth guarantees in an input-buffered crossbar switch may be realized
using an algorithm referred to as Weighted Probabilistic Iterative Matching (WPIM), which is
essentially a weighted version of the algorithm described in Anderson et al. Although the WPIM
algorithm is more suitable for hardware implementations than that described by McKeown et al.,
it does not appear to provide bandwidth guarantees, and the delay performance in general has not
been understood.

One method of providing bandwidth and delay guarantees in an input-buffered crossbar
architecture uses statically computed schedule tables (an example of which is described in
Anderson et al.); however, there are several significant limitations associated with this approach.
First, the computation of schedule tables is extremely complex and time-consuming. Therefore,
it can only be performed at connection setup time. Adding a new flow or changing the rates of
the existing flows is quite difficult and time-consuming, since such modification can require
re-computation of the whole table. Without such re-computation, it is frequently impossible to
provide delay and even bandwidth guarantees even for [...] these table updates tend to be
performed less frequently than may be desired. Second, there exists the necessity to constrain
the supported rates to a rather coarse rate granularity and to restrict the smallest supported rate
in order to limit the size of the schedule table. These limitations serve to substantially reduce
the flexibility of providing QoS.

WO 99/35792 PCT/US99/00684

The search for scalable solutions which can provide high-quality QoS has led to several
approaches. In one approach, an algorithm allows for the emulation of a non-blocking
output-buffered switch with an output FIFO queue by using an input-buffered crossbar with speedup
independent of the size of the switch. See B. Prabhakar and N. McKeown, "On the Speedup
Required for Combined Input and Output Queued Switching," Computer Systems Lab. Technical
Report CSL-TR-97-738, Stanford University. More specifically, this reference shows that such
emulation is possible with a speedup of 4 and conjectures that a speedup of 2 may suffice. This
result allows one to emulate a particular instantiation of a non-blocking output-buffered
architecture without having to use a speedup of the order of the switch size, i.e., speedup equal
to the number of ports. However, this algorithm is only capable of a very limited emulation of
an output buffered switch with FIFO service. Furthermore, as described in the above-referenced
technical report, such emulation does not provide any delay guarantees and the delay performance
in general is not well understood. Its capability of providing bandwidth guarantees over a large
time scale is limited to flows which are already shaped at the input to the switch, and no
bandwidth guarantees can be provided in the presence of misbehaved flows.

It should be noted that in speeded-up input buffered architectures the instantaneous rate
of data entering an output channel may exceed the channel capacity. Therefore, buffering is
required not only at the inputs, but also at the outputs. Therefore, input-buffered crossbar
switches with speedup are also known as combined input/output buffered switches. Hereinafter,
the more conventional term "speeded-up input-buffered crossbar" shall be used.

Another published study of speeded-up input buffered switches suggests that
input-buffered switches with even small values of speedup may be capable of providing delays
comparable to those of output-buffered switches, but is silent as to the dependence of these
delays on the switch size within the framework described therein. See R. Guerin and K.
Sivarajan, "Delay and Throughput Performance of Speeded-up Input-Queuing Packet Switches,"
IBM Research Report RC 20892, June 1997.

Thus, there exists a present need in the art to provide adequate delay performance to
guaranteed flows while utilizing the scalability of a crossbar architecture with speedup
independent of switch size.

SUMMARY OF THE INVENTION 
Accordingly, it is an object of the present invention to provide per-packet delays
independent of the switch size in an input-buffered switch.

It is yet another object of the present invention to ensure protection to all flows against
misbehaved flows.

It is still yet another object of the present invention to accommodate dynamically
changing load and flow composition while operating at high speed, as well as avoid the
imposition of artificial restrictions on supported rates.

In accordance with the purposes of the present invention, as embodied and described
herein, the above and other purposes are attained by a method of providing delay performance
independent of a switch size in an input-buffered switch with a speedup S greater than two
having input channels and output channels for transferring cells therebetween. The method
includes providing, to each of the input channels, per-output-channel queues to buffer cells
awaiting transfer to the output channels, each per-output-channel queue being associated with a
respective input channel and output channel, and having an assigned rate and an ideal service
associated therewith; providing an arbiter to control transmission of buffered cells from input
channels to output channels, the arbiter having a rate controller to schedule at a given cell slot the
queues in the input channels, the rate controller to guarantee to each queue an amount of actual
service that is within fixed bounds from the ideal service of the queue, the fixed bounds each
being equal to one cell. For each per-output-channel queue, a pair of state variables is maintained,
including a first and a second state variable, the first state variable corresponding to an ideal
beginning time of a next cell of the per-output queue and the second state variable corresponding
to an ideal finishing time of transmission of the next cell of the per-output queue. The method
further includes initializing the first and second state variables, the first state variable being equal
to one and the second state variable being equal to one divided by the assigned rate; initializing
an arbiter clock counter to count switch phases to zero; providing a set_match set and a
set_queues set; initializing the set_match set to include an empty set and the set_queues set to
include all of the pairs; running the rate controller to select from the set_queues set ones of the
pairs having a smallest eligible finish time first and, for the selected pair, updating the first state
variable with the ideal finish time and the second state variable with an ideal beginning time plus
one divided by the assigned rate; adding the selected pair to the set_match set; removing from the
set_queues set those pairs corresponding to the same input channel and output channel as the
selected pair; determining whether or not the set_queues set is empty. When the set_queues set
is determined to be empty, the method includes notifying the input channels of the
per-output-channel queues corresponding to those pairs added to the set_match set, incrementing
the counter by one and returning to the step of initializing the set_match and set_queues sets; and
when the set_queues set is determined to be not empty, then returning to the step of running the
rate controller.

The present invention achieves several important goals. It provides per cell/packet delays
independent of the switch size comparable to delay guarantees associated with non-blocking
output-buffered architectures, while utilizing the scalability of a crossbar. It allows arbitrary
assignment of rates (as long as the rates are feasible in the sense that the sum of all rates does not
exceed the total available bandwidth at any input or any output). Additionally, it allows the
flexibility to quickly admit new flows and change the rate assignment of existing flows.
Moreover, it ensures protection of well-behaved flows against misbehaved flows.

More specifically, simulations indicate that such a system is capable of providing delays
comparable to those of an output buffered switch, with any speedup of greater than or equal to
two, and the delays observed are independent of switch size.

While the invention is primarily related to providing per-packet/cell delays to guaranteed
flows, it can be used in conjunction with best-effort traffic as well. If best effort traffic is present,
it is assumed that the invention as described herein is run at an absolute priority over any
scheduling algorithm for best effort traffic.

BRIEF DESCRIPTION OF THE DRAWINGS

The above objects, features and advantages of the present invention will become more
apparent from the following description of the embodiments of the present invention illustrated
in the accompanying drawings, wherein:

FIG. 1 is a block diagram depicting an input-buffered crossbar switch capable of utilizing
per-output-channel queue scheduling and arbitration schemes in accordance with the present
invention; and

FIG. 2 is a flow diagram illustrating a queue scheduling and arbitration scheme for
providing delays independent of the switch size in accordance with the present invention.



DESCRIPTION OF THE PREFERRED EMBODIMENTS 
Referring to FIG. 1, with like reference numerals identifying like elements, there is shown
an input-buffered crossbar switch 10 implementing a crossbar arbitration scheme in accordance
with the present invention. As illustrated in FIG. 1, the underlying architecture of the
input-buffered crossbar switch 10 is represented as an n x m crossbar. Here, "n" is the number of input
channels i (1 ≤ i ≤ n) 12 and "m" is the number of output channels j (1 ≤ j ≤ m) 14. Each input
channel has one or more input ports 16, each of which corresponds to a physical input link 18.
Similarly, the output channels each have one or more output ports 20, each corresponding to a
physical output link 22. The input channels 12 are connected to the output channels 14 by way
of a crossbar unit 24. It will be understood by those skilled in the art that the crossbar unit as
depicted in FIG. 1 includes a crossbar switch fabric of known construction, the details of which
have been omitted for purposes of simplification. It is the crossbar switch fabric that is
responsible for transferring cells between input and output channels.

In the embodiment shown, the total capacity of all input channels and output channels
is assumed to be the same, although the capacity of individual links may be different.
Hereinafter, the capacity of a single channel is denoted by r_c. The speed of the switch fabric,
denoted by r_sw, is assumed to be S times faster than the speed of any channel. In general, the
switch and the channel clocks are not assumed to be synchronized. The speedup values may be
arbitrary (and not necessarily integer) values in the range of 1 ≤ S ≤ n. It is further assumed that
the switch operates in phases whose duration is defined as the time needed to transmit a unit of
data at speed r_sw. Such phases are referred to as matching phases. In this disclosure, a unit of
data shall be referred to as a cell. Accordingly, a switch can move at most one cell from each
input channel and at most one cell to each output channel at each matching phase. Therefore, on
average, a switch with speedup S can move S cells from each input channel and S cells to each
output channel per channel cell time. At S=n, the switch is equivalent to the output buffered switch.

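
The relationship between channel speed, fabric speed and matching-phase duration described above can be illustrated with a short Python sketch. The function names and the 424-bit cell size are illustrative assumptions, not part of the disclosure; only the relations r_sw = S * r_c and "phase duration = time to send one cell at r_sw" come from the text.

```python
from fractions import Fraction


def fabric_speed(r_c, speedup):
    """Fabric speed r_sw, assumed to be S times the channel speed r_c."""
    if speedup < 1:
        raise ValueError("speedup S is assumed to be at least 1")
    return Fraction(speedup) * Fraction(r_c)


def matching_phase_duration(cell_bits, r_c, speedup):
    """Time needed to transmit one cell at speed r_sw (one matching phase)."""
    return Fraction(cell_bits) / fabric_speed(r_c, speedup)


def cell_slot_duration(cell_bits, r_c):
    """Time needed to transmit one cell at channel speed r_c (one cell slot)."""
    return Fraction(cell_bits) / Fraction(r_c)


# Hypothetical numbers: 424-bit cells, 1 Gb/s channels, speedup S = 2.
phase = matching_phase_duration(424, 10**9, 2)
slot = cell_slot_duration(424, 10**9)
phases_per_slot = slot / phase   # equals the speedup S
```

Exact fractions are used so that a non-integer speedup, which the text allows, does not introduce rounding error.
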
Although not shown in FIG. 1, packets received on a given input link 18 are typically
buffered at the input ports. Also, each flow to which the received packets correspond may be
allocated a separate buffer or queue at the input channel. These "per-flow" queues may be
located in an area of central memory within the input channel. Alternatively, flow queues may
be located in a memory in the input ports associated with the input channel. When the packets
received from the input links are of variable length, they are fragmented into fixed-size cells. If
the packets arriving at the switch all have a fixed length, e.g., a cell in ATM networks, no
fragmentation is required. In packet switching networks, where arriving packets are of different
sizes, the implementation is free to choose the size of the cell as is convenient. The tradeoff in
the choice of this size is that the smaller the cell, the better delays can be provided, but the faster
the arbitration must be (and therefore the more expensive the switch). In addition, small cell
size causes larger fragmentation overhead. Upon arrival and after possible fragmentation, cells
are mapped to a corresponding flow (based on various classifiers: source address, destination
address, protocol type, etc.). Once mapped, the cells are placed in the appropriate "per-flow"
queue.
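
The cell-size tradeoff mentioned above can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that each cell carries a fixed-length header and that the last cell is zero-padded; the text does not specify a cell format, so `header_len`, the padding byte and both function names are hypothetical.

```python
from typing import List


def fragment(packet: bytes, cell_payload: int) -> List[bytes]:
    """Split a variable-length packet into fixed-size cell payloads.

    The last payload is zero-padded to the cell size (an assumption).
    """
    if cell_payload <= 0:
        raise ValueError("cell payload size must be positive")
    cells = [packet[k:k + cell_payload]
             for k in range(0, len(packet), cell_payload)]
    if cells and len(cells[-1]) < cell_payload:
        cells[-1] = cells[-1].ljust(cell_payload, b"\x00")
    return cells


def total_overhead(packet_len: int, cell_payload: int, header_len: int) -> int:
    """Bytes sent beyond the packet itself: per-cell headers plus padding."""
    n_cells = -(-packet_len // cell_payload)   # ceiling division
    return n_cells * (cell_payload + header_len) - packet_len


# With a hypothetical 5-byte header, a 96-byte packet costs 10 extra bytes
# in 48-byte cells but 30 extra bytes in 16-byte cells.
overhead_large_cells = total_overhead(96, 48, 5)
overhead_small_cells = total_overhead(96, 16, 5)
```

Under these assumptions the smaller cell triples the overhead for the same packet, which is the cost that must be weighed against the better delay granularity of small cells.
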

Associated with each guaranteed flow is some rate r_f, which is typically established at
connection setup time (e.g., via RSVP). Rates assigned to guaranteed flows can also be changed
during a renegotiation of service parameters as allowed by the current RSVP specification. It is
assumed that the rate assignment is feasible, i.e., the sum of the rates of all flows at each input
port does not exceed the capacity of this input port, and the sum of rates of all
flows across all input ports destined to a particular output port does not exceed the capacity of that
output port. If the sum of port capacities equals the channel capacity as assumed here, the
feasibility of rates across all input and output ports implies the feasibility of rates across all input
and output channels. Included in the rate r_f assigned to the flow is any overhead associated with
packet fragmentation and re-assembly. The actual data rate negotiated at connection setup may
therefore be lower. For networks with fixed packet size, such as ATM, however, no
segmentation and re-assembly is required. Thus, no overhead is present.
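
The feasibility condition above amounts to two sets of sums. A minimal Python sketch follows, assuming flows are given as hypothetical (input, output, rate) tuples and that capacities are supplied per input and per output; none of these names appear in the original text.

```python
from collections import defaultdict


def is_feasible(flows, input_capacity, output_capacity):
    """Check that assigned rates do not oversubscribe any input or output.

    flows: iterable of (input_index, output_index, rate) tuples.
    input_capacity / output_capacity: dicts mapping index -> capacity.
    """
    in_load = defaultdict(float)
    out_load = defaultdict(float)
    for i, j, rate in flows:
        in_load[i] += rate
        out_load[j] += rate
    inputs_ok = all(load <= input_capacity[i] for i, load in in_load.items())
    outputs_ok = all(load <= output_capacity[j] for j, load in out_load.items())
    return inputs_ok and outputs_ok


capacity = {0: 1.0, 1: 1.0}
flows = [(0, 0, 0.5), (0, 1, 0.5), (1, 0, 0.25)]
feasible = is_feasible(flows, capacity, capacity)
```

An admission-control step could run such a check before accepting a new flow or a rate change, although the text does not prescribe how feasibility is enforced.
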

As shown in FIG. 1, each input channel i 12 has m virtual output queues (VOQs) or
per-output-channel queues 26 (also referred to as per-output or virtual output queues), denoted by
Q(i,j), 1 ≤ j ≤ m, one for each output channel j 14. In the embodiment shown in FIG. 1, the
input channel maintains a single flow-level scheduler S_f(i) 28, which needs to schedule only a
single flow per cell time. Once scheduler S_f(i) schedules some flow f, it adds the index of this
flow f (or, alternatively, the head of the line (HOL) cell of flow f) to the tail of queue Q(i,j).
Thus, depending on the implementation, Q(i,j) may contain either pointers to cells or cells of
individual flows. Any known QoS-capable scheduler, such as those described above, can be used
for the S_f scheduler.
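
One possible in-memory layout for the per-output-channel queues of a single input channel is sketched below in Python. The class and method names are hypothetical, outputs are indexed from 0 for convenience, and the flow-level scheduler itself is left out; the sketch only shows that a scheduled flow index is appended to the tail of Q(i,j).

```python
from collections import deque


class InputChannel:
    """Holds the m per-output-channel queues Q(i, j) of input channel i."""

    def __init__(self, index, num_outputs):
        self.index = index
        # One FIFO per output channel j = 0 .. m-1.
        self.voq = [deque() for _ in range(num_outputs)]

    def enqueue_scheduled_flow(self, flow_id, output):
        """Called after the flow-level scheduler picks a flow destined to `output`."""
        self.voq[output].append(flow_id)

    def head_of_line(self, output):
        """Return the HOL entry of Q(i, output), or None if the queue is empty."""
        queue = self.voq[output]
        return queue[0] if queue else None


channel = InputChannel(index=0, num_outputs=3)
channel.enqueue_scheduled_flow("flow-a", output=2)
channel.enqueue_scheduled_flow("flow-b", output=2)
```

Storing flow indices rather than cells keeps the queues small; storing the cells themselves is the alternative the text mentions.
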

In another variation, each input channel could maintain one flow-level scheduler S_f(i,j)
for each output. When the input channel i needs to transmit a cell to a given output j, it invokes
scheduler S_f(i,j) to determine which flow destined to j should be chosen. Unlike the option
described above, in which scheduler S_f(i) can run at link speed, the flow-level schedulers S_f(i,j)
must be capable of choosing up to S cells per cell time, as it is possible that this input may need
to send a cell to the same output in all S matching phases of the current cell slot. In yet another
approach, the input can run in parallel m S_f schedulers, one per output. Each of these schedulers
may schedule 1 ≤ k ≤ S cells per cell time. When a flow is scheduled by S_f(i,j), an index to this flow
is added to Q(i,j).

Also included in the input-buffered crossbar switch 10 is an arbiter 30 as shown in FIG.
1. It is the arbiter's responsibility to determine which of the input channels should be able to
transmit a cell to particular output channels, i.e., cells from which per-output-channel queues
should be transmitted. It is assumed that arbiter 30 operates in matching phases. The duration
of each phase is equal to the duration of the channel cell slot divided by the speedup S. The goal
of the arbiter is to compute a maximal (conflict-free) match between the input and output
channels so that at most one cell leaves any input channel and at most one cell enters any output
channel during a single matching phase. Although the term "maximal match" (or, alternatively,
"maximal matching") is well understood by those skilled in the art, a definition may be had with
reference to papers by N. McKeown et al. and Stiliadis et al., cited above, as well as U.S. Patent
No. 5,517,495 to Lund et al.
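
For readers unfamiliar with the term, the following Python sketch checks the two properties using the usual graph-theoretic meaning: a match is conflict-free if no input or output appears twice, and maximal if no further candidate pair can be added without creating a conflict. The function names are illustrative and the definition is the standard one rather than a quotation from the cited references.

```python
def is_conflict_free(match):
    """True if every input and every output appears at most once in `match`."""
    inputs = [i for i, _ in match]
    outputs = [j for _, j in match]
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


def is_maximal(match, candidates):
    """True if `match` is conflict-free and no candidate pair can still be added."""
    used_inputs = {i for i, _ in match}
    used_outputs = {j for _, j in match}
    return is_conflict_free(match) and all(
        i in used_inputs or j in used_outputs for i, j in candidates
    )


candidates = {(0, 0), (0, 1), (1, 0)}
# {(0, 0)} is maximal (both other pairs conflict with it) even though the
# larger match {(0, 1), (1, 0)} exists; maximal does not mean maximum.
small_is_maximal = is_maximal({(0, 0)}, candidates)
```
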

As explained above, during each of its matching phases, the arbiter decides which input
can send a cell to which output by computing a maximal matching between all inputs and all
outputs. The algorithm used to compute the maximal match is described in detail in paragraphs
to follow. Once the matching is completed, the arbiter notifies each input of the output to which
it can send a cell by sending to the input channel the index of the per-output queue from which
the cell is to be transmitted. The input channel then picks a cell to send to that output channel
and the cell is transmitted to the output channel. As shown in FIG. 1, the arbiter 30 maintains
for each input/output pair i,j a pair of variables (b_i,j, f_i,j) denoted as A(i,j) 32. How the arbiter
utilizes these input/output pairs will be described in detail below.

When an input channel 12 receives from the arbiter 30 the index of the Q(i,j)
corresponding to the output channel 14 for the current matching phase, it forwards the HOL cell
of Q(i,j) (or, alternatively, the cell pointed to by the HOL pointer in Q(i,j)) to the output channel
j. If Q(i,j) is empty, that is, there is no cell of a guaranteed flow in the queue, then a cell of a
lower-priority service destined to the same output is sent instead. If there is no best effort traffic
at this input in this matching phase, then no cell is sent.
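
The input channel's choice in a matching phase can be sketched as a small Python function. The two-queue arrangement (one guaranteed queue Q(i,j) and one best-effort queue per output) and the function name are assumptions made for the illustration.

```python
from collections import deque


def cell_to_send(guaranteed_queue, best_effort_queue):
    """Pick the cell an input forwards after being matched to an output.

    Guaranteed traffic in Q(i, j) has absolute priority; best-effort traffic
    to the same output is used only if Q(i, j) is empty. Returns None when
    there is nothing to send in this matching phase.
    """
    if guaranteed_queue:
        return guaranteed_queue.popleft()
    if best_effort_queue:
        return best_effort_queue.popleft()
    return None


q_guaranteed = deque(["g1"])
q_best_effort = deque(["be1"])
first = cell_to_send(q_guaranteed, q_best_effort)    # guaranteed cell goes first
second = cell_to_send(q_guaranteed, q_best_effort)   # then the best-effort cell
third = cell_to_send(q_guaranteed, q_best_effort)    # nothing left to send
```
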

Although not shown in FIG. 1, a cell forwarded by an input channel i to an output channel
j is added to a queue maintained by the output channel. A variety of queuing disciplines can be
used, such as FIFO, per-input-port, or per flow. If the queue is not a simple FIFO, each output
has an additional scheduler, shown in FIG. 1 as output scheduler S_o 34. This output scheduler
determines the order in which cells are transmitted onto the output link from the output channel.
It is assumed that any required reassembly occurs before S_o is used, so that S_o schedules
packets rather than cells.

Any known QoS-capable scheduler such as those mentioned above can be used for the
scheduler S_o.

Since each scheduler S_f, S_o operates independently of the other, the delay of an
individual cell in the switch is the sum of the delay of this cell under its input and output
schedulers S_f and S_o, plus the delay due to the potential arbitration conflicts. The delay of a
packet segmented into cells comprises the delay experienced by its last cell plus the
segmentation and re-assembly delays.

Still referring to FIG. 1, it can now be appreciated that, with respect to each input channel,
each of the queues Q(i,j) contains cells (or pointers to cells) which have already been scheduled
by S_f but which have not yet been transmitted to their destination output channel with which the
VOQ is associated due to arbitration conflicts. The present invention undertakes the task of
determining the sequence of transmissions between input channels and output channels satisfying
the crossbar constraint that only one cell can leave an input channel and enter an output channel
per phase in such a way that the arbitration delay is bounded for each cell awaiting its
transmission at the input channel.

Now referring to FIG. 2, there is illustrated the actions of the arbiter with respect to
scheduling the per-output-channel queues 40 in accordance with the present invention. As
previously indicated, the arbiter maintains a pair of variables (b_i,j, f_i,j) or A(i,j) for each
input/output pair i,j. These variables b_i,j and f_i,j will be referred to as the starting time and the
finish time, respectively. The starting time is the ideal beginning time of transmission of the next cell
of the queue with which the input/output pair is associated. The finish time is the ideal finishing
time of transmission of the next cell of the queue with which the input/output pair is associated.
At initial step 42, the arbiter obtains for each input/output pair i,j the rate r_i,j, which is the sum
of the assigned rates of all flows going from input i to output j. Also, in the same step, variables
b_i,j and f_i,j are initialized (to zero and 1/r_i,j, respectively) and a count value time is set to zero.
As further illustrated in FIG. 2, at each matching phase the arbiter computes the maximal
match as follows. In step 44, the arbiter initializes a Set_Match set to an empty set and a
Set_Queues set to all A(i,j). Now referring to step 46 in FIG. 2, the arbiter selects the pair A(i,j)
having the smallest finish time f_i,j among all eligible pairs, where eligible pairs are defined as
those whose starting time is at or before the current time. In step 48, the arbiter adds the pair
selected in step 46 to Set_Match, updates the variables such that b_i,j = f_i,j and f_i,j = f_i,j + 1/r_i,j
as indicated in step 50 and, in step 52, removes from set Set_Queues all pairs corresponding to
the input and/or output of the A(i,j) selected in step 46. If there are any pairs remaining in
Set_Queues (step 54), the arbiter returns to step 46 and performs the next iteration of the
matching process. Otherwise, the matching is complete. In step 56, for each A(i,j) in the match,
the arbiter informs the input i to send a cell to output j. As can be seen, the A(i,j) in the match
correspond to the per-output-channel queues Q(i,j) from which a cell should be transmitted in the
current matching phase. The arbiter then proceeds to the next matching phase, incrementing
count time by one (step 58) and updating the rates r_i,j as necessary (step 60) before returning to
step 44.
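
The matching procedure of FIG. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: rates are taken to be expressed in cells per matching phase, ties between equal finish times are broken by pair index, the loop stops early when no remaining pair is eligible, and rate updates (step 60) are omitted. The class and attribute names are hypothetical.

```python
from fractions import Fraction


class Arbiter:
    """Sketch of the FIG. 2 arbiter; step numbers refer to the flow diagram."""

    def __init__(self, rates):
        # Step 42: rates maps (i, j) -> r_i,j > 0 (assumed cells per phase).
        self.rates = {p: Fraction(r) for p, r in rates.items() if r > 0}
        self.b = {p: Fraction(0) for p in self.rates}        # starting times
        self.f = {p: 1 / r for p, r in self.rates.items()}   # finish times
        self.time = 0

    def matching_phase(self):
        set_match = []                     # step 44
        set_queues = set(self.rates)
        while set_queues:                  # step 54
            eligible = [p for p in set_queues if self.b[p] <= self.time]
            if not eligible:
                break                      # assumption: nothing left to select
            # Step 46: smallest finish time first (ties broken by pair index).
            i, j = min(eligible, key=lambda p: (self.f[p], p))
            set_match.append((i, j))       # step 48
            self.b[(i, j)] = self.f[(i, j)]            # step 50
            self.f[(i, j)] += 1 / self.rates[(i, j)]
            # Step 52: drop every pair sharing the selected input or output.
            set_queues = {(a, c) for (a, c) in set_queues if a != i and c != j}
        self.time += 1                     # step 58
        return set_match                   # step 56: pairs to notify


# Hypothetical 2 x 2 switch in which every pair has rate 1/2 cell per phase.
arbiter = Arbiter({(i, j): Fraction(1, 2) for i in range(2) for j in range(2)})
phase_1 = arbiter.matching_phase()
phase_2 = arbiter.matching_phase()
```

In this example the first phase matches (0, 0) and (1, 1); their starting times then move to 2, so the second phase serves the other two pairs, and each pair receives one cell every two phases as its rate requires.
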

In an alternative input-buffered switch algorithm described in a co-pending application,
which runs a separate version of the rate controller per input and performs arbitration using the
scheduling times of the rate controllers, the delay bound is a function of the size of the switch
(i.e., the number of input channels). In contrast, the arbiter of the present invention runs a single
rate controller across all queues regardless of the input or output channels to which they
correspond and uses finish times (rather than scheduled times) as described above. Also, in the
above-referenced co-pending application, the input rate controllers which schedule per-output
queues at each input are oblivious to potential arbitration conflicts. The arbitration conflicts are
resolved at the arbiter using timestamps of the scheduling times of the input rate controllers.
Here, in the present invention, the rate controller which is run in the arbiter uses ideal start and
finish times of all input/output pairs directly and explicitly resolves arbitration conflicts as part
of the operation of the rate controller. Hence, the advantage of the present invention is that the
observed delays are independent of the size of the switch and depend only on the rate of the
flow. However, in the present invention the rate controller must operate at the speed of the
switch fabric, whereas the input channel rate controllers in the co-pending application need to
operate at a slower channel speed. Likewise, the size of the input to the rate controller in the
present invention is n x m, whereas in the co-pending invention the input to each of the rate
controllers is only m. As a result, the implementation of the co-pending invention may be less
expensive, especially at high speeds.

While the disclosed input-buffered switch and scheduling method has been particularly
shown and described with reference to the preferred embodiments, it will be understood by those
skilled in the art that various modifications in form and detail may be made therein without
departing from the scope and spirit of the invention as set forth by the claims. Accordingly,
modifications such as those suggested above, but not limited thereto, are to be considered within
the scope of the claims.

CLAIMS

1. A method of providing delay performance independent of a switch size in an
input-buffered switch with a speedup S greater than two having input channels and output channels for
transferring cells therebetween, the method comprising:

providing, to each of the input channels, per-output-channel queues to buffer cells
awaiting transfer to the output channels, each per-output-channel queue being associated with a
respective input channel and output channel, and having an assigned rate and an ideal service
associated therewith;

providing an arbiter to control transmission of buffered cells from input channels to output
channels, the arbiter having a rate controller to schedule at a given cell slot the queues in the input
channels, the rate controller to guarantee to each queue an amount of actual service that is within
fixed bounds from the ideal service of the queue, the fixed bounds each being equal to one cell;

for each per-output-channel queue, maintaining a pair of state variables including a first
and a second state variable, the first state variable corresponding to an ideal beginning time of
a next cell of the per-output queue and the second state variable corresponding to an ideal
finishing time of transmission of the next cell of the per-output queue;

initializing the first and second state variables, the first state variable being equal to one
and the second state variable being equal to one divided by the assigned rate;

initializing an arbiter clock counter to count switch phases to zero;

providing a set_match set and a set_queues set;

initializing the set_match set to include an empty set and the set_queues set to include all
of the pairs;

running the rate controller to select from the set_queues set ones of the pairs having a
smallest eligible finish time first and, for the selected pair, updating the first state variable with
the ideal finish time and the second state variable with an ideal beginning time plus one divided
by the assigned rate;

adding the selected pair to the set_match set;

removing from the set_queues set those pairs corresponding to the same input channel and
output channel as the selected pair;

determining whether or not the set_queues set is empty;

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when the set queues set is determined to be empty, notifying the input channels of the 
per-output^^chamiel queuesconesponding 1 

the counter by one and returning to ihe step of initializing the setjnatch and set^queues sets; and 
when the set_queues set is detennined to be not empty, then letuming to the step of 
lunning the rate controller. 
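
For illustration only, the arbitration loop recited in claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `init_state` and `arbitrate_phase` are hypothetical; a pair is assumed to be "eligible" when its ideal beginning time is not later than the current phase count; the loop is assumed to stop early when no eligible pair remains; queue occupancy is ignored; and the beginning time is initialized to 0 as shown in FIG. 2, whereas the claim text above reads "one".

```python
from fractions import Fraction


def init_state(rates):
    """Build the per-pair state variables [b, f].

    rates maps (input, output) -> assigned rate in cells per phase.
    Assumption: b starts at 0 (per FIG. 2) and f at 1/rate.
    """
    return {pair: [Fraction(0), 1 / rate] for pair, rate in rates.items()}


def arbitrate_phase(state, rates, time):
    """Run one switch phase and return the matched (input, output) pairs."""
    set_match = []            # starts as the empty set
    set_queues = set(state)   # starts with all (input, output) pairs
    while set_queues:
        # Assumption: "eligible" means ideal beginning time <= current time.
        eligible = [p for p in set_queues if state[p][0] <= time]
        if not eligible:
            break  # assumption: nothing more can be matched in this phase
        # Smallest eligible finish time first; ties broken by pair index.
        selected = min(eligible, key=lambda p: (state[p][1], p))
        b, f = state[selected]
        # New beginning time is the old finish time; new finish time
        # is the new beginning time plus 1/rate.
        state[selected] = [f, f + 1 / rates[selected]]
        set_match.append(selected)
        i, j = selected
        # Drop every pair sharing the selected input or output channel.
        set_queues = {p for p in set_queues if p[0] != i and p[1] != j}
    return set_match


if __name__ == "__main__":
    # Hypothetical 2x2 switch, each pair assigned half a cell per phase.
    demo_rates = {(i, j): Fraction(1, 2) for i in range(2) for j in range(2)}
    demo_state = init_state(demo_rates)
    for t in range(3):
        print(t, arbitrate_phase(demo_state, demo_rates, t))
```

In this sketch each phase yields a set of pairs in which no input channel and no output channel appears twice, which is the crossbar constraint that the removing step enforces. Exact rational arithmetic (`Fraction`) is used so that repeated additions of 1/rate do not accumulate rounding error.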



WO 99/35792    PCT/US99/00684







2/2

FIG. 2 (flowchart; box text recovered from the drawing sheet, listed in flow order):

OBTAIN PER INPUT/OUTPUT RATES r(i,j); SET TIME = 0; AND INITIALIZE STATE VARIABLES b(i,j) = 0 AND f(i,j) = 1/r(i,j)

SET_MATCH = {EMPTY}; SET_QUEUES = {A(i,j)}

SELECT FROM SET_QUEUES THE A(i,j) PAIR WITH THE SMALLEST ELIGIBLE FINISH TIME FIRST

ADD SELECTED PAIR TO SET_MATCH

b(i,j) = f(i,j); f(i,j) = f(i,j) + 1/r(i,j)

REMOVE ALL A(i,j) CORRESPONDING TO SAME INPUT AND/OR OUTPUT AS SELECTED PAIR

SEND INDICES OF QUEUES CORRESPONDING TO PAIRS IN SET_MATCH TO INPUT CHANNELS

TIME = TIME + 1

UPDATE RATES r(i,j) IF NECESSARY

Reference numerals appearing on the sheet: 40, 42, 44, 46, 48, 50, 52, 58.



INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 99/00684

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 H04L12/56 H04Q11/04

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC 6 H04L H04Q

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category   Citation of document, with indication, where appropriate,   Relevant to
           of the relevant passages                                     claim No.

A          GB 2 293 720 A (ROKE MANOR RESEARCH)                         1
           3 April 1996
           see page 4, line 10 - page 6, line 24

A          EP 0 817 436 A (NEWBRIDGE NETWORKS CORP;                     1
           XEROX CORP (US)) 7 January 1998
           see page 6, line 49 - page 8, line 53

A          WO 97 31460 A (HAL COMPUTER SYSTEMS INC)                     1
           28 August 1997
           see page 5, line 26 - page 6, line 25


[ ] Further documents are listed in the continuation of box C.

[X] Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance

"E" earlier document but published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the priority date claimed

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family



Date of the actual completion of the international search: 18 May 1999

Date of mailing of the international search report: 27/05/1999

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL-2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl
Fax: (+31-70) 340-3016

Authorized officer: Lindner, A

Form PCT/ISA/210 (second sheet) (July 1992)



INTERNATIONAL SEARCH REPORT

Information on patent family members

International Application No
PCT/US 99/00684

Patent document          Publication    Patent family     Publication
cited in search report   date           member(s)         date

GB 2293720 A             03-04-1996     CA 2158324 A      31-03-1996
                                        EP 0705007 A      03-04-1996
                                        JP 8125669 A      17-05-1996
                                        US 5734650 A      31-03-1998

EP 0817436 A             07-01-1998     EP 0817431 A      07-01-1998
                                        EP 0817432 A      07-01-1998
                                        EP 0817433 A      07-01-1998
                                        EP 0817434 A      07-01-1998
                                        EP 0817435 A      07-01-1998
                                        JP 10242999 A     11-09-1998
                                        JP 10190691 A     21-07-1998
                                        JP 10190692 A     21-07-1998

WO 9731460 A             28-08-1997     US 5892766 A      06-04-1999
                                        EP 0823167 A      11-02-1998

Form PCT/ISA/210 (patent family annex) (July 1992)



